Stochastic analysis of an agent-based model 
We analyze the dynamics of a forecasting game which exhibits the phenomenon of information
cascades. Each agent aims at correctly predicting a binary variable and he/she can either look for
independent information or herd on the choice of others. We show that the dynamics can be analytically
described in terms of a Langevin equation and that its collective behavior is described by the solution of
a Kramers' problem. This provides very accurate results in the region where the vast majority of
agents herd, which is the most interesting one from a game-theoretic point of view.
I. INTRODUCTION



Information aggregation plays an essential role in the subsistence, evolution and self-organization of living beings. The
selective advantage of sexual over asexual reproduction relies partly on the higher efficiency of the former in aggregating
the genetic information of fitter individuals [1]. Fish, birds and mammals gather in groups to obtain from each other
information on the location of food or predators [2]. In human behaviour, the emergence of fashions, fads and social consensus
is an expression of (attempts at) information aggregation. In economics, markets have been celebrated on theoretical grounds
for their virtue of correctly aggregating dispersed information into prices, but the occurrence of crashes
and bubbles clearly suggests that in practice markets may fail in this respect [3].

The failure of information aggregation can arise because of herding, as vividly illustrated by the phenomenon of
rational herding and information cascades [4]. In brief, these are situations where many individuals aim at gathering
some information about an event, each having access to some private information. If all individuals choose according to
their private signals, the choice of the majority correctly aggregates all available information. This means that the
choice of the majority conveys much more precise information than the signal each agent receives and hence, if agents
can observe the choice of others, it becomes profitable to follow the majority's choice. The problem is that if all agents
do this, choices no longer reflect private information, and information aggregation fails.

The emergence of large fluctuations in socio-economic phenomena such as social consensus Q , herding Q , bubbles 
and crashes in financial markets 0, 0] can be conveniently formalized in simple Ising like models. This approach 
reveals clearly the role which (spontaneous) symmetry breaking and long range order have in these phenomena. It 
also allows, in principle, to derive a precise description of the dynamical process taking place. 

In this paper, we study in detail a simple model where information cascades arise in a system of spins interacting
on a directed graph and which is intimately connected to phase coexistence [8]. We will show that its dynamics can
be analytically described in terms of a Langevin equation and that its collective behavior is related to the solution of a
Kramers' problem. This approach yields excellent results in the most interesting region of the phase diagram, where the
vast majority of agents herd.



II. THE MODEL 



The model describes a population of agents engaged in the task of predicting a binary variable $E = \pm 1$ (e.g. whether
something will occur or not). Each agent has two possible strategies: either to look independently for information on
the event, or to infer information from his or her peers. The first option delivers a signal which predicts the correct
outcome with probability $p > 1/2$, whereas the second option, i.e. herding, gives the agent access to the predictions
of a group of $K > 1$ other agents. While informed agents follow the recommendation of the signal they receive, herders
enter an information exchange stage, where their predictions gradually converge to those of the majority of their peer
group. The key issue is what the majority of the population will do, depending on how many agents decide
to herd, and, given this, what is the Nash equilibrium strategy, i.e.
that which maximizes the probability of predicting the correct outcome. When most agents look for independent
information, herding is the best move as it delivers the aggregate information of more than one agent. However, if
all agents herd, no one looks for information, so the population will not be able to predict the correct
outcome.

In order to model such a situation, Ref. [8] considers a system of $N$ agents, each characterized by a spin variable
$s_i$, taking the value $s_i = +1$ in the case of a correct prediction and $s_i = -1$ otherwise. The information exchange
process can be described by the following asynchronous update process: at each discrete time $n = 0, 1, \ldots$ an agent $i$
is picked at random and his/her spin is updated as

$$s_i(n+1) = \mathrm{sign}\left\{ h_i + \sum_{j=1}^{N} a_{ij}\, s_j(n) \right\} \tag{1}$$

whereas $s_j(n+1) = s_j(n)$ for all $j \neq i$. We denote with $i \in I$ informed agents and with $i \in H$ herders. The
difference between the two strategies is encoded in the values of $h_i$ and $a_{ij}$. The former describes the signals
received by informed agents

$$h_i = \begin{cases} +1 & \text{with probability } p, \text{ for } i \in I \\ -1 & \text{with probability } 1-p, \text{ for } i \in I \\ 0 & \text{for } i \in H \end{cases} \tag{2}$$

whereas $a_{ij}$ describes the network through which herders collect their information. Therefore, if $G_i$ is the group of
agents that agent $i \in H$ copies,

$$a_{ij} = \begin{cases} 1 & i \in H,\ j \in G_i \\ 0 & \text{else} \end{cases} \tag{3}$$

Each herder $i \in H$ has $|G_i| = K = \sum_j a_{ij}$ peers in his/her group, which are chosen at random among the $N - 1$
other agents [11]. We shall also assume that $K$ is odd, so that Eq. (1) always provides a well defined recommendation.
Summarizing, the first term in the braces of Eq. (1) is non-zero only for $i \in I$ whereas the second is non-zero only for
$i \in H$. All spins are initially assigned a random value $s_i(0) = \pm 1$ with equal probability.
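The update rule above can be illustrated with a minimal simulation sketch. It is not the code used for the figures: the function name `simulate`, the choice of labelling the first $\eta N$ agents as herders, the fixed number of sweeps used in place of a convergence check, and the seeds are all assumptions made for illustration.

```python
import random

def simulate(N=200, eta=0.8, p=0.55, K=11, sweeps=50, seed=0):
    """Run one realization of the update rule of Eq. (1); return the final q.

    q is the fraction of herders whose spin is +1 (correct forecast).
    Assumes 0 < eta <= 1, N > K and K odd.
    """
    rng = random.Random(seed)
    n_herd = int(round(eta * N))          # agents 0..n_herd-1 are herders (labelling is arbitrary)
    # Signals h_i of Eq. (2): 0 for herders, +1 w.p. p and -1 w.p. 1-p for informed agents.
    h = [0] * n_herd + [1 if rng.random() < p else -1 for _ in range(N - n_herd)]
    # Peer groups G_i of Eq. (3): K distinct agents other than i, drawn at random.
    groups = []
    for i in range(n_herd):
        drawn = rng.sample(range(N), K + 1)
        groups.append([j for j in drawn if j != i][:K])
    # Random initial spins s_i(0) = +/-1 with equal probability.
    s = [rng.choice((-1, 1)) for _ in range(N)]
    for _ in range(sweeps * N):           # asynchronous single-spin updates
        i = rng.randrange(N)
        if i < n_herd:
            # Herder: follow the majority of the peer group (K odd, so no ties).
            s[i] = 1 if sum(s[j] for j in groups[i]) > 0 else -1
        else:
            # Informed agent: follow the private signal.
            s[i] = h[i]
    return sum(1 for i in range(n_herd) if s[i] == 1) / n_herd
```

Calling `simulate(eta=0.9, seed=k)` for several seeds `k` gives one value of $q$ per realization, which is the quantity plotted point by point in Fig. 1.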

Fig. 1 shows the result of numerical simulations for different system sizes $N$ and different fractions

$$\eta = \frac{|H|}{N} \tag{4}$$

of herding agents. The dynamics converges to a fixed point and we denote simply by $s_i$ the value the $i$-th spin attains.
Fig. 1 reports the probability of a right forecast of a randomly chosen agent $i \in H$, which is given by

$$q = \frac{1}{\eta N} \sum_{i \in H} \frac{1 + s_i}{2} \tag{5}$$

Each point represents a single realization.

While in the left part of the figure the $q$ values are spread around a single value, on the right they are polarized onto
two values, $q \approx 0$ (almost certainly wrong forecast) and $q \approx 1$ (almost certainly right forecast). The average
$\langle q \rangle$ of $q$ over different realizations (see Fig. 2) exhibits a non-monotonic behavior with $\eta$ and a strong
dependence on the system size $N$ for $\eta$ close to 1.

One particularly relevant point $\eta^*$ is that for which $\langle q \rangle = p$, i.e. where the probability of success
is independent of the choice of strategy. For $\eta > \eta^*$ informed agents perform better than herders, so it is
reasonable that some herders would switch strategy, thus decreasing $\eta$. Likewise, for $\eta < \eta^*$ informed agents
have an incentive to switch to herding, thus increasing $\eta$. Therefore we expect the population to converge to a
fraction $\eta^*$ of herders, which can be thought of as a Nash equilibrium.

III. MEAN FIELD IN THE STATIONARY STATE 

Fig. 1 suggests that the system exhibits a phase transition around $\eta \approx 1/2$. In order to investigate this behavior,
Ref. [8] suggested a self-consistent approach. First of all, let us notice that the probability $\pi$ for a generic agent
(whether informed or herding) of making the right forecast is

$$\pi = P(s_i = +1 \mid i \in I \cup H) = (1 - \eta)\, p + \eta\, q \tag{6}$$

Figure 1: Probabilities $q$ of a herding agent making the right forecast at different fractions $\eta$ of herding agents for
$p = 0.55$, $K = 11$ and $N = 200$ ($+$), $10^3$ ($\times$), $10^4$ ($*$). Full lines represent solutions of
$q = \Omega_K(\pi)$ as in (7); the unstable solution $q_u$ is plotted with a thinner line.

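Besides, for $i \in H$, the probability $q$ can be viewed as the sum of the probabilities of all configurations of $G_i$
in which the majority makes the right forecast, i.e.

$$q = \Omega_K(\pi) \equiv \sum_{g=(K+1)/2}^{K} \binom{K}{g}\, \pi^{g} (1 - \pi)^{K-g} \tag{7}$$
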
Since (6) holds, (7) represents an implicit equation in $q$, whose solutions are plotted with full lines in Fig. 1 for
comparison with numerical results. Eq. (7) admits one solution for $\eta < \eta_c$ and three for $\eta > \eta_c$. Individual
outcomes of the game cluster around the upper solution $q_+$ of Eq. (7) for $\eta < \eta_c$, whereas some realizations also
converge to the lower solution $q_-$ for $\eta > \eta_c$. No realization converges to the intermediate solution $q_u$ for
$\eta > \eta_c$. This suggests that the upper and lower branches of the full lines correspond to two stable solutions
($q_+$ and $q_-$), whereas the middle one, $q_u$, corresponds to an unstable solution which separates the basins of
attraction of $q_+$ and $q_-$. This hypothesis is confirmed by the detailed analysis of the dynamics discussed in the next section.
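The self-consistent equation can be solved numerically in a few lines. The sketch below assumes that $\Omega_K(\pi)$ is the probability that a majority of $K$ independent agents, each right with probability $\pi$, is right; the function names `omega` and `fixed_points`, the scan resolution and the bisection tolerance are illustrative choices, not taken from the paper.

```python
from math import comb

def omega(K, pi):
    """Probability that the majority of K independent agents is right (K odd)."""
    return sum(comb(K, g) * pi**g * (1 - pi)**(K - g)
               for g in range((K + 1) // 2, K + 1))

def fixed_points(eta, p=0.55, K=11, grid=2000, tol=1e-12):
    """Roots in [0, 1] of f(q) = Omega_K((1-eta) p + eta q) - q.

    Scans a uniform grid for sign changes and refines each one by bisection.
    """
    def f(q):
        return omega(K, (1 - eta) * p + eta * q) - q

    roots = []
    for k in range(grid):
        lo, hi = k / grid, (k + 1) / grid
        flo, fhi = f(lo), f(hi)
        if flo == 0.0:                    # root sits exactly on a grid point
            roots.append(lo)
            continue
        if flo * fhi < 0:                 # sign change: bracketed root
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    if f(1.0) == 0.0:
        roots.append(1.0)
    return roots
```

With $p = 0.55$ and $K = 11$, `fixed_points(0.2)` returns a single root, while `fixed_points(0.9)` returns three, which play the roles of $q_-$, $q_u$ and $q_+$.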



In order to understand the behavior of the average outcome $\langle q \rangle$ shown in Fig. 2, it is necessary to compute
the probability that the system will converge to one or the other fixed point $q_\pm$ for $\eta > \eta_c$. Indeed we can write

$$\langle q \rangle = p_-\, q_- + (1 - p_-)\, q_+ \tag{8}$$

where $p_-$ is the probability that a realization of the system converges to the fixed point $q_-$. Eq. (8) represents
an approximation for large $N$ since, as shown in Fig. 1, fixed point values deviate from the $q_\pm$ solutions for finite $N$.
The dispersion of fixed points is however very small for $\eta$ close to one, which is the relevant region where the Nash
equilibrium is located.

While $q_-$ and $q_+$ are given by the solutions of Eq. (7), in order to derive an expression for $p_-$ we will model
the dynamics of $q$ as the information exchange process takes place. As we shall see, $q$ can be considered as the
coordinate of a particle in a one-dimensional double-well potential, with one minimum at $q_+$ and one at $q_-$,
and subject to a stochastic dynamics. Problems of this type, known as Kramers' problems (e.g., [10]), are well known
in the theory of stochastic processes. In order to apply these results, we shall
first derive a continuous-time dynamics for $q(\tau)$ (Langevin equation) and then use the backward Fokker-Planck
equation to evaluate the probability $p_-(q_0)$ that a single realization of the system, starting from an initial "position"
$q_0$ and evolving according to a stochastic dynamics, falls in the stable solution $q_-$.



IV. STOCHASTIC DYNAMICS

We stress that we denote by $p_-(q_0)$ the probability that a single realization of the system with initial condition $q_0$
falls in $q_-$, to distinguish it from

$$p_- = \int_0^1 dq_0\; p_-(q_0)\, \Omega(q_0), \tag{9}$$

i.e. from the probability that a generic realization falls in $q_-$, which is obtained by taking the average over the
distribution $\Omega(q_0)$ of $q_0$. For each realization of the system, the initial value $q_0$ depends on the initial draw
of the spins $s_i(0)$ and hence, by the central limit theorem, the probability distribution of $q_0$ is well approximated by a
Gaussian with mean $1/2$ and variance $1/(4\eta N)$, i.e.

$$\Omega(q_0) = \sqrt{\frac{2 \eta N}{\pi}}\; e^{-2 \eta N \left(q_0 - \frac{1}{2}\right)^2} \tag{10}$$

where $\pi$ in the prefactor is the numerical constant, not the probability of Eq. (6).



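In order to derive a continuous-time description of the problem, we introduce a continuous time $\tau$ such that
$\tau = n/(\eta N)$ and consider the variation $dq = q(\tau + d\tau) - q(\tau)$ over a time interval
$d\tau = \Delta n/(\eta N)$. In a single time step, we have
$q(n+1) = q(n) + \frac{1}{2 \eta N} \left[ s_{i(n)}(n+1) - s_{i(n)}(n) \right]$, where $i(n)$ stands for the selected herding
agent. Summing this over $n \in [\eta N \tau, \eta N (\tau + d\tau))$ and using Eq. (1) yields

$$q(\tau + d\tau) = q(\tau) + \frac{1}{2 \eta N} \sum_{n = \eta N \tau}^{\eta N (\tau + d\tau) - 1} \left( \mathrm{sign} \sum_{j \in G_{i(n)}} s_j(n) - s_{i(n)}(n) \right). \tag{11}$$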

For $N \gg 1$ we can have $d\tau \ll 1$, as appropriate for deriving continuous-time equations, but with $\Delta n \gg 1$,
which allows us to apply the central limit theorem (CLT) to estimate the right hand side. In order to do this, we regard
the $s_i(n)$ as random variables with distribution

$$P(s_i(n) = +1 \mid i \in H) = q(\tau), \qquad n \in [\eta N \tau, \eta N \tau + \Delta n)$$

if $i \in H$, whereas $P(s_i(n) = +1 \mid i \in I) = p$. This allows us to estimate the mean and the variance of the right
hand side of Eq. (11). Therefore, for $N \gg 1$ the dynamics is well approximated by the Langevin equation



$$\frac{dq(\tau)}{d\tau} = b[q(\tau)] + \sigma \sqrt{a[q(\tau)]}\; \xi(\tau) \tag{12}$$



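where $\xi(\tau)$ is a white noise term with $\langle \xi(\tau) \rangle = 0$ and
$\langle \xi(\tau) \xi(\tau') \rangle = \delta(\tau - \tau')$, and whose drift and diffusion terms are given by

$$b[q(\tau)] = \Omega_K(\pi) - q(\tau) \tag{13}$$
$$a[q(\tau)] = \Omega_K(\pi) \left[ 1 - \Omega_K(\pi) \right] + q(\tau) \left[ 1 - q(\tau) \right] \tag{14}$$
$$\sigma^2 = \frac{1}{\eta N} \tag{15}$$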
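A single trajectory of this Langevin equation can be generated with a simple Euler-Maruyama scheme. The sketch below is only illustrative: the discretization scheme, the time step `dt`, the horizon `t_max`, the clamping of $q$ to $[0, 1]$ and the function names are assumptions and are not part of the analysis in the text.

```python
import math
import random
from math import comb

def omega(K, pi):
    """Probability that the majority of K independent agents is right (K odd)."""
    return sum(comb(K, g) * pi**g * (1 - pi)**(K - g)
               for g in range((K + 1) // 2, K + 1))

def langevin_path(q0, eta, N, p=0.55, K=11, dt=1e-2, t_max=20.0, seed=0):
    """Euler-Maruyama integration of dq = b dt + sigma sqrt(a) dW; returns q(t_max)."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(eta * N)          # Eq. (15)
    q = q0
    for _ in range(int(round(t_max / dt))):
        om = omega(K, (1 - eta) * p + eta * q)
        b = om - q                            # drift, Eq. (13)
        a = om * (1 - om) + q * (1 - q)       # diffusion, Eq. (14)
        q += b * dt + sigma * math.sqrt(a * dt) * rng.gauss(0.0, 1.0)
        q = min(1.0, max(0.0, q))             # assumption: keep q inside [0, 1]
    return q
```

For very large $N$ the noise is negligible and the trajectory follows the deterministic drift: at $\eta = 0.9$ a path started at $q_0 = 0.8$ ends near the upper fixed point and one started at $q_0 = 0.2$ ends near the lower one. For moderate $N$, repeating the run over many seeds and initial conditions gives a Monte Carlo estimate of how often each fixed point is reached.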

In passing, it is worth remarking that $q_u$ is indeed an unstable fixed point of the deterministic dynamics, while $q_\pm$
are stable. In fact, denoting the first derivative of $b[q]$ with respect to $q$ by $b'[q]$ and a generic fixed point of the
dynamics by $q^*$, it is easy to show that $b'[q^*] > 0$ if $q^* = q_u$ and $b'[q^*] < 0$ if $q^* = q_\pm$. Therefore,
considering the linear expansion of $q$ around $q^*$, $q = q^* + \delta q$, and the first order relationship
$\frac{d\, \delta q}{d\tau} \simeq b'[q^*]\, \delta q$, we see that $q_u$ is unstable while $q_\pm$
are stable.
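The sign of $b'[q^*]$ can be checked directly. The sketch below assumes the majority form of $\Omega_K$ and uses the standard identity for the derivative of a binomial tail, $d\Omega_K/d\pi = K \binom{K-1}{(K-1)/2} [\pi (1 - \pi)]^{(K-1)/2}$; the function name `drift_slope` is illustrative. The limiting case $\eta = 1$ is used as an example because the fixed points are then exactly $0$, $1/2$ and $1$.

```python
from math import comb

def drift_slope(q, eta, p=0.55, K=11):
    """b'(q) = eta * dOmega_K/dpi - 1, evaluated at pi = (1 - eta) p + eta q (K odd)."""
    pi = (1 - eta) * p + eta * q
    m = (K - 1) // 2
    d_omega = K * comb(K - 1, m) * (pi * (1 - pi)) ** m
    return eta * d_omega - 1.0

# eta = 1: q* = 0 and q* = 1 have negative slope (stable),
# q* = 1/2 has positive slope (unstable).
slopes = {q_star: drift_slope(q_star, eta=1.0) for q_star in (0.0, 0.5, 1.0)}
```

For $\eta < 1$ the same function can be evaluated at fixed points obtained numerically from Eq. (7).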

Let us now turn to the computation of $p_-(q_0) = P(q_-, \tau = \infty \mid q_0, \tau_0 = 0)$, i.e. the probability that the
system reaches $q_-$ starting from $q_0$ in a time interval $\Delta\tau = \infty$. As a function of $q_0$ and $\tau_0$,
$p_-(q_0)$ satisfies a backward Fokker-Planck equation

$$b[q_0]\, \partial_q p_-(q) \big|_{q = q_0} + \frac{\sigma^2}{2}\, a[q_0]\, \partial_q^2 p_-(q) \big|_{q = q_0} = 0 \tag{16}$$



whose drift and diffusion coefficients $b[q_0]$ and $\sigma \sqrt{a[q_0]}$ are given by (13), (14), (15) evaluated at
$\tau = 0$ and whose boundary conditions are chosen to be absorbing, i.e. $p_-(q_0)|_{q_0 = q_-} = 1$ and
$p_-(q_0)|_{q_0 = q_+} = 0$. Defining

$$\varphi(q) = \exp\left\{ -\frac{2}{\sigma^2} \int_{q_-}^{q} dq'\, \frac{b[q']}{a[q']} \right\} \tag{17}$$

the solution to Eq. (16) is given by
$p_-(q_0) = \int_{q_0}^{q_+} dq\, \varphi(q) \Big/ \int_{q_-}^{q_+} dq\, \varphi(q)$.

Figure 2: Average probability $\langle q \rangle$ of a herding agent making the right forecast at different fractions $\eta$
of herding agents for $p = 0.55$, $K = 11$ and $N = 200$ ($\times$) or $10^4$ ($+$).

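Therefore, from Eq. (9), the probability that a generic realization of the process converges to the fixed point $q_-$
is given by

$$p_- = \sqrt{\frac{2 \eta N}{\pi}} \int_0^1 dq_0\; \frac{\int_{q_0}^{q_+} dq\, \varphi(q)}{\int_{q_-}^{q_+} dq\, \varphi(q)}\; e^{-2 \eta N \left(q_0 - \frac{1}{2}\right)^2}. \tag{18}$$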



We numerically integrated (18) with the Gauss-Legendre method. Results are shown in Fig. 5.
The analytical results are in excellent agreement with numerical simulations for finite $N$ in the region around the Nash
equilibrium point ($\eta$ close to one).
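A numerical evaluation of this probability can be sketched as follows. This is not the Gauss-Legendre code used for the results above: it swaps in a plain trapezoidal rule on a uniform grid, restricts the Gaussian average to $[q_-, q_+]$ (the mass outside is negligible for $\eta$ close to one), and subtracts the largest exponent before exponentiating to avoid overflow. All function names and grid sizes are assumptions made for illustration.

```python
import math
from math import comb

def omega(K, pi):
    """Probability that the majority of K independent agents is right (K odd)."""
    return sum(comb(K, g) * pi**g * (1 - pi)**(K - g)
               for g in range((K + 1) // 2, K + 1))

def outer_fixed_points(eta, p, K, scan=2000):
    """Lowest and highest roots of Omega_K((1-eta) p + eta q) = q (assumes eta > eta_c)."""
    f = lambda q: omega(K, (1 - eta) * p + eta * q) - q
    brackets = [(k / scan, (k + 1) / scan) for k in range(scan)
                if f(k / scan) * f((k + 1) / scan) < 0]

    def bisect(lo, hi):
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0:
                hi = mid
            else:
                lo = mid
        return 0.5 * (lo + hi)

    return bisect(*brackets[0]), bisect(*brackets[-1])

def p_minus_curve(eta, N, p=0.55, K=11, M=4000):
    """Grid on [q_-, q_+] and the probability p_-(q0) of ending in q_- on that grid."""
    qm, qp = outer_fixed_points(eta, p, K)
    h = (qp - qm) / M
    qs = [qm + k * h for k in range(M + 1)]
    ratio = []                                   # b[q] / a[q] on the grid
    for q in qs:
        om = omega(K, (1 - eta) * p + eta * q)
        ratio.append((om - q) / (om * (1 - om) + q * (1 - q)))
    expo = [0.0]                                 # -(2 / sigma^2) * int_{q_-}^{q} b/a
    for k in range(M):
        expo.append(expo[-1] - 2 * eta * N * 0.5 * h * (ratio[k] + ratio[k + 1]))
    top = max(expo)
    phi = [math.exp(e - top) for e in expo]      # rescaled phi; the scale cancels in the ratio
    cum = [0.0]                                  # int_{q_-}^{q} phi
    for k in range(M):
        cum.append(cum[-1] + 0.5 * h * (phi[k] + phi[k + 1]))
    return qs, [1.0 - c / cum[-1] for c in cum]

def p_minus(eta, N, p=0.55, K=11, M=4000):
    """Average of p_-(q0) over the Gaussian law of q0 (mean 1/2, variance 1/(4 eta N))."""
    qs, pm = p_minus_curve(eta, N, p, K, M)
    norm = math.sqrt(2 * eta * N / math.pi)
    vals = [pk * norm * math.exp(-2 * eta * N * (q - 0.5) ** 2)
            for q, pk in zip(qs, pm)]
    h = qs[1] - qs[0]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

For example, `p_minus(0.9, 200)` and `p_minus(0.9, 2000)` can be compared to see how the probability of ending in the wrong state changes with system size; the curve returned by `p_minus_curve` equals one at $q_-$ and zero at $q_+$ by construction.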

The lower accuracy of the analytic approach in the intermediate region $\eta \approx 1/2$ is probably due to the larger
dispersion of fixed points for finite $N$, which is neglected here. Note also that $q_\pm$ are not absorbing points of the
dynamics as long as $q_\pm \neq 0$ or $1$, because while the deterministic term vanishes the stochastic term of Eq. (14) is
non-zero. Again, for $\eta$ close to one, this effect is negligible as indeed $q_- \approx 0$ and $q_+ \approx 1$.



V. CONCLUSIONS 



In conclusion, we derived an accurate approximation for the stochastic dynamics of a simple model that emulates 
a phenomenon of information aggregation. A static analysis of the model suggests the presence of a phase transition, 
characterized by phase coexistence of three solutions, two stable and one unstable. We have shown that the dynamics 
can be described analytically as a Langevin equation and the problem of computing the probability to end up in 
one of the two stable equilibrium points can be cast in the form of a Kramers' problem. This approach describes
the system very accurately in the region where the vast majority of the agents herd, which is the most interesting one
from the point of view of game theory, i.e. around the Nash equilibrium point.

Open problems include a more accurate description of the system in the region away from $\eta^*$, the analysis of the
system's behaviour in the case of (partially) symmetric interactions, and the effects of contrarian agents
(anti-ferromagnetic interaction).